ABSTRACT

This paper presents four new statistical measures of monotone relationship
derived from the concept of monotone correlation. A nonlinear optimization
algorithm is employed to evaluate these new measures, as well as the monotone
correlation, for ordinal contingency tables. A computer program to implement
the algorithm is developed, and is applied to several insightful examples to
provide further understanding of the usefulness of these measures.

1. Introduction and Statistical Background

Measuring and understanding the basis for the association between two random
variables X and Y is extremely important for the intelligent application of
statistics, as well as for more insight into the underlying bivariate
probabilistic structures. The focus of this paper is on association between ordinal
random variables, that is, random variables where the observed values have a
natural ordering without necessarily having naturally ascribed numerical values.
For example, the values may arise from questionnaire responses based on the
five-point scale: strongly disagree, disagree, no opinion, agree, strongly agree.

In measuring association between two ordinal variables using a measure based
on assigning numerical values to the possible data values, it is natural to
require that the resultant numerical measure of association not depend on the actual
numerical values but only on the orderings. This property can be described as
monotone scale invariance. When Pearson's correlation coefficient is used by
assigning the values 1, ..., N to the scale levels, the resulting measure is not
monotone scale invariant. For the five-point scale example, assigning 1 to
strongly disagree, ..., 5 to strongly agree and then computing the Pearson
correlation would not provide a monotone invariant measure of association.
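
A small numerical illustration may help here. The sketch below uses a hypothetical 3 x 3 probability table (the numbers are invented for illustration and do not come from this paper) and a helper named `pearson_from_table` (also an assumption of this sketch) to show that two order-preserving codings of the same ordinal levels can give different Pearson correlations.

```python
from math import sqrt

def pearson_from_table(P, x, y):
    """Pearson correlation of row scores x and column scores y under joint probabilities P."""
    rows, cols = range(len(P)), range(len(P[0]))
    p_row = [sum(P[i][j] for j in cols) for i in rows]   # marginal distribution of X
    p_col = [sum(P[i][j] for i in rows) for j in cols]   # marginal distribution of Y
    ex = sum(p_row[i] * x[i] for i in rows)
    ey = sum(p_col[j] * y[j] for j in cols)
    exy = sum(P[i][j] * x[i] * y[j] for i in rows for j in cols)
    var_x = sum(p_row[i] * x[i] ** 2 for i in rows) - ex ** 2
    var_y = sum(p_col[j] * y[j] ** 2 for j in cols) - ey ** 2
    return (exy - ex * ey) / sqrt(var_x * var_y)

# Hypothetical joint distribution on a 3 x 3 ordinal lattice (illustrative only).
P = [[0.2, 0.1, 0.0],
     [0.1, 0.2, 0.1],
     [0.0, 0.1, 0.2]]

r_equal = pearson_from_table(P, [1, 2, 3], [1, 2, 3])        # equally spaced codes
r_stretched = pearson_from_table(P, [1, 2, 10], [1, 2, 10])  # same ordering, different spacing
print(round(r_equal, 4), round(r_stretched, 4))
```

Both codings preserve the ordering of the levels, yet the two correlations differ (roughly 0.67 versus 0.57 for this table), which is the lack of monotone scale invariance described above.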

One monotone invariant scale measure is the sup correlation $\rho'$ introduced
by Gebelein (1941), developed further by Sarmanov (1958a,b), Renyi (1959), and
Lancaster (1969), and defined by $\rho'(X,Y) = \sup \rho(f(X), g(Y))$, where the supremum
is taken over all Borel-measurable functions f, g such that $0 < \mathrm{Var}\, f(X) < \infty$ and
$0 < \mathrm{Var}\, g(Y) < \infty$, and where $\rho$ is the Pearson correlation coefficient. For random
variables (X,Y) jointly taking values on a finite rectangular lattice, there
are computational methods for computing $\rho'$ using eigenvalue routines (see Sarmanov
and Lancaster). For continuous random variables X, Y, the sup correlation is
computable only in special cases where the joint p.d.f. admits a certain type
of bivariate orthogonal expansion (see Lancaster).

An important dependence concept between two random variables is that of
complete dependence, introduced by Lancaster (1965). A random variable Y is
said to be completely dependent on a random variable X if there exists a
function g such that

(1.1)  $\mathrm{Prob}[Y = g(X)] = 1.$

If Y is completely dependent on X and vice versa, then X and Y are said to be
mutually completely dependent; in this case X and Y are perfectly predictable
from each other. Observe that if X and Y are mutually completely dependent then
$\rho'(X,Y) = 1$.

Kimeldorf and Sampson (1978) provided an example of random variables X and
Y which were mutually completely dependent and yet were "almost" stochastically
independent. To circumvent this difficulty, Kimeldorf and Sampson defined Y to
be monotone increasing (decreasing) dependent on X if (1.1) holds for a
monotone increasing (decreasing) function g. Furthermore, motivated by trying to
measure the degree of monotone dependence, they defined the monotone correlation
between random variables X and Y by

(1.2)  $\rho^*(X,Y) = \sup \rho(f(X), g(Y)),$

where the supremum is taken over all monotone functions f, g for which
$0 < \mathrm{Var}\, f(X) < \infty$ and $0 < \mathrm{Var}\, g(Y) < \infty$. The monotone correlation is a monotone
scale invariant measure of association and the maximizing functions (assuming
they exist) for (1.2) are the "best" monotone scalings for cross linear
predictability of X and Y. (Monotone scalings are order-preserving assignments of
numerical values to ordinal data.) Kimeldorf and Sampson evaluated the monotone
correlation in only the two special situations: (i) X and Y bivariate normal,
in which case $\rho^* = |\rho|$, and (ii) X and Y independent, in which case $\rho^* = 0$.

The purposes of this paper are twofold: One is to derive new measures
associated with the monotone correlation and to study their applicability. A
second is to provide a computational procedure and computer program to evaluate
the monotone correlation and these derived measures for the case when X and Y
assume a finite number of values.* The approach is to find an equivalent
nonlinear program and then employ a modification of the optimization algorithm of
May (1979) to compute the maximizing values and the points at which they occur.

In Section 2, we introduce the concepts of concordancy, discordancy, and
isoscaling for measuring monotone association. The equivalent nonlinear programs
are given in Section 3. The specific algorithm and the computer program, which
we call MONCOR, are described in Section 4. A number of interesting applications
and examples are considered in Section 5. In Section 6, we discuss how these
methods might be used for scale reduction.

*The authors are in the process of examining procedures for data from
certain continuous distributions.

2. Concordancy, Discordancy, and Isoscaling

The concept of monotone correlation can be refined by measuring separately
the strength of the relationship between X and Y in a positive direction and
the strength of the relationship in a negative direction, that is, to measure
separately the extent of concordancy and of discordancy between X and Y. These
concepts are related to so-called measures of disagreement and measures of
dissociation. If in (1.2), f and g are both restricted to be increasing* (or
equivalently both decreasing), the resulting measure is called the concordant
monotone correlation (CMC). When f is restricted to be increasing and g
decreasing (or equivalently f decreasing and g increasing), we find it
convenient to examine $-\sup \rho(f(X), g(Y))$, which in turn can be expressed as
$-\sup \rho(f(X), -g(Y))$, where both f and g are increasing. This leads naturally to
defining the discordant monotone correlation (DMC) by $\inf \rho(f(X), g(Y))$, where f
and g are both restricted to be increasing.

The  DMC  and  CMC  have  natural  interpretations  as  measures  of  negative  and 
positive  association,  respectively,  for  ordinal  random  variables.  They  also  can 
be  interpreted  as  providing  bounds  for  the  correlation  between  any  arbitrary 
monotone  scalings;  specifically,  for  arbitrary  increasing  f  and  g, 

(2.1)  $\mathrm{DMC} \le \rho(f(X), g(Y)) \le \mathrm{CMC}.$

Suppose it is desired to impose numeric monotone scalings for a pair of new
tests; if the CMC and DMC are close, then by (2.1) it makes little difference
which monotone scales are used. Also if CMC = DMC = 0, then X and Y are
independent random variables; however, it is possible for DMC < CMC = 0 and X and
Y not to be independent. Further note that if X and Y are increasing monotone
dependent then CMC = 1; and if X and Y are decreasing monotone dependent, then
DMC = -1.

*The terms increasing and decreasing are used non-strictly.

Sometimes the situation occurs when X and Y should have the same scaling.
For example, X is a psychological test score pretreatment and Y is the score
post-treatment on the same test. This leads to another extension of monotone
correlation, which we refer to as isoscaling. If in (1.2) we restrict f = g,
the resulting measure is called the isoconcordant monotone correlation (ICMC).
Analogous to the DMC definition, the isodiscordant monotone correlation
(IDMC) is given by $\inf \rho(f(X), g(Y))$, where f = g. Obviously isoscaling is
not practically appropriate when X and Y have essentially different ranges of
values.

If X and Y are exchangeable ordinal random variables it might be
conjectured due to all the symmetries involved that ICMC = CMC (and IDMC = DMC).
However, as is shown in Section 5, this is surprisingly not the case.

The actual functions that maximize the correlations (assuming they exist)
are of importance in developing monotone scales. We refer to such functions
generically as monotone variables with their specific interpretation depending
on which monotone correlation measure is used in their derivation.


3.  Program  Formulation 

The preceding extensions of the monotone correlation are applicable to all
suitable pairs of random variables, continuous or discrete. We now focus on the
case where X and Y jointly take on a finite number of values $(a_i, b_j)$,
$i = 1, \ldots, I$, $j = 1, \ldots, J$, and $\mathrm{Prob}(X = a_i, Y = b_j) = p_{ij}$. Then

(3.1)  $\mathrm{CMC} = \max \dfrac{\sum_{i=1}^{I} \sum_{j=1}^{J} f(a_i)\, p_{ij}\, g(b_j) - \bigl(\sum_{i=1}^{I} p_{i\cdot} f(a_i)\bigr)\bigl(\sum_{j=1}^{J} p_{\cdot j}\, g(b_j)\bigr)}{\bigl(\sum_{i=1}^{I} f^2(a_i)\, p_{i\cdot} - (\sum_{i=1}^{I} p_{i\cdot} f(a_i))^2\bigr)^{1/2} \bigl(\sum_{j=1}^{J} g^2(b_j)\, p_{\cdot j} - (\sum_{j=1}^{J} p_{\cdot j}\, g(b_j))^2\bigr)^{1/2}}$

subject to f and g being increasing functions for which the denominator in (3.1)
is non-zero, and where $p_{i\cdot} = \sum_{j=1}^{J} p_{ij}$ and $p_{\cdot j} = \sum_{i=1}^{I} p_{ij}$. Denote the values $f(a_i)$
by $x_i$, $i = 1, \ldots, I$, and $g(b_j)$ by $y_j$, $j = 1, \ldots, J$, so that (3.1) can be
reformulated as

(3.2)  $\mathrm{CMC} = \max \dfrac{x'Py - (x'Pe)(y'P'e)}{\bigl(\sum_i x_i^2\, p_{i\cdot} - (x'Pe)^2\bigr)^{1/2} \bigl(\sum_j y_j^2\, p_{\cdot j} - (y'P'e)^2\bigr)^{1/2}}$

subject to:  $x_1 \le \ldots \le x_I$,  $y_1 \le \ldots \le y_J$,  $x \ne c\,e$,  $y \ne c\,e$ for every constant c,

where $x = (x_1, \ldots, x_I)'$, $y = (y_1, \ldots, y_J)'$, $P = (p_{ij})$ and $e = (1, \ldots, 1)'$
(of conformable dimension).
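
The objective in (3.2) depends only on P, x, and y, which a short sketch can make concrete. The function names `corr_objective` and `is_feasible`, and the 2 x 3 matrix used below, are assumptions introduced for illustration; this is not the MONCOR code.

```python
from math import sqrt

def corr_objective(P, x, y):
    """Objective of (3.2): correlation of the scalings x, y under probability matrix P."""
    I, J = len(P), len(P[0])
    p_row = [sum(P[i]) for i in range(I)]                       # p_i. = sum_j p_ij
    p_col = [sum(P[i][j] for i in range(I)) for j in range(J)]  # p_.j = sum_i p_ij
    xPy = sum(x[i] * P[i][j] * y[j] for i in range(I) for j in range(J))
    xPe = sum(x[i] * p_row[i] for i in range(I))
    yPe = sum(y[j] * p_col[j] for j in range(J))
    var_x = sum(x[i] ** 2 * p_row[i] for i in range(I)) - xPe ** 2
    var_y = sum(y[j] ** 2 * p_col[j] for j in range(J)) - yPe ** 2
    if var_x <= 0 or var_y <= 0:
        raise ValueError("x and y must not be constant vectors (x = c e or y = c e)")
    return (xPy - xPe * yPe) / sqrt(var_x * var_y)

def is_feasible(x, y):
    """Monotonicity constraints of (3.2): both scalings are non-strictly increasing."""
    def increasing(v):
        return all(a <= b for a, b in zip(v, v[1:]))
    return increasing(x) and increasing(y)

# Hypothetical 2 x 3 probability matrix (illustrative only).
P = [[0.3, 0.1, 0.1],
     [0.1, 0.1, 0.3]]
x, y = [0.0, 1.0], [0.0, 0.5, 1.0]
r = corr_objective(P, x, y)

# An increasing affine change of either scaling leaves the objective unchanged.
r_shifted = corr_objective(P, [3 + 2 * v for v in x], [10 * v - 1 for v in y])
print(is_feasible(x, y), round(r, 4), round(r_shifted, 4))
```

The last two printed values agree, reflecting that the correlation is determined by x and y only up to location and scale.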

Thus to compute the CMC all that is required is the matrix P of probabilities.
For instance, the values $a_1, \ldots, a_5$ could be the five-point scale strongly
disagree, ..., strongly agree. The resultant monotone variable x would then
provide a numeric scale assigning $x_1$ to strongly disagree, ..., $x_5$ to strongly
agree.

Analogous formulations of (3.2) can be given for ICMC, DMC, and IDMC. Again
the ICMC and IDMC are not defined when $I \ne J$.

When reporting the monotone variables, we standardize them without loss of
generality so that in (3.2), for example, $x_1 = y_1 = 0$ and $x_I = y_J = 1$.

Until this point, the CMC, etc., have been defined as population quantities.
For data from finite discrete distributions, the joint probabilities can be
estimated from the data viewed in ordinal contingency table form. Then the CMC
can be evaluated based upon the estimated probabilities. In this situation, the
CMC can either be viewed as an estimate of the "true" CMC or be viewed as a
measure of monotone association for the ordinal contingency table.

4.  Optimization  Approach  and  MONCOR  Description 

The nonlinear programming problem (3.2) involves the optimization of a
nonlinear fractional form subject to linear constraints. Note that if it were not
for the monotone constraints, (3.2) would be an eigenvalue problem. The
objective function in (3.2) is not pseudoconcave. To see this, consider the simple
case of evaluating the ICMC = $\max\, (x'Px - (x'Pe)^2) / (\sum_i x_i^2\, p_{i\cdot} - (x'Pe)^2)$ for a
symmetric probability matrix P. While both numerator and denominator are
continuously differentiable on the feasible region, and $(\sum_i x_i^2\, p_{i\cdot} - (x'Pe)^2)$ is a
positive convex function of x, $(x'Px - (x'Pe)^2)$ would have to be nonnegative and
concave for pseudoconcavity (see Avriel (1976)). This latter condition does not
hold in general for symmetric P. Hence, in general, the CMC, ICMC, DMC and
IDMC will involve the optimization of a function with local optima. Although
much work is presently being done in the area of global optimization (see, for
example, Dixon and Szego (1975), (1978)), we follow the standard procedure of
using various starting points, computing the optima, and then choosing the best
result based upon the different starting points.

Note that since correlation is unique in x and y only up to location and
scale change, we could express (3.2) as

(4.1)  Maximize $x'Py$,

subject to $\sum_i x_i\, p_{i\cdot} = 0$, $\sum_j y_j\, p_{\cdot j} = 0$, $\sum_i x_i^2\, p_{i\cdot} = 1$, $\sum_j y_j^2\, p_{\cdot j} = 1$,
$x_1 \le x_2 \le \ldots \le x_I$, $y_1 \le y_2 \le \ldots \le y_J$.

The formulation of (4.1), because of its nonlinear constraints, is not a
desirable formulation since complexity in the objective function is much easier to
deal with than complexity in the constraints. The constraints $x \ne c\,e$ and $y \ne c\,e$
in (3.2) are not computationally implementable in continuous variables. However,
without loss of generality, we eliminate those constraints by fixing $x_1$ and $y_1$
at zero and $x_I$ and $y_J$ at one.
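
For reference, the step from (3.2) to (4.1) can be written out explicitly. The following worked equation is a sketch that uses only the symbols of (3.2); it shows why the standardization constraints collapse the fractional objective to the bilinear form $x'Py$.

```latex
\begin{aligned}
&\text{If}\quad x'Pe = \sum_i x_i\, p_{i\cdot} = 0,\quad
  y'P'e = \sum_j y_j\, p_{\cdot j} = 0,\quad
  \sum_i x_i^2\, p_{i\cdot} = 1,\quad
  \sum_j y_j^2\, p_{\cdot j} = 1, \\
&\text{then}\quad
\frac{x'Py - (x'Pe)(y'P'e)}
     {\bigl(\sum_i x_i^2\, p_{i\cdot} - (x'Pe)^2\bigr)^{1/2}
      \bigl(\sum_j y_j^2\, p_{\cdot j} - (y'P'e)^2\bigr)^{1/2}}
 = \frac{x'Py - 0}{(1 - 0)^{1/2}\,(1 - 0)^{1/2}}
 = x'Py .
\end{aligned}
```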

Specifically, the computation of the CMC (DMC) involves optimizing a
nonconcave (nonconvex) function in I + J - 4 independent variables subject to
monotonicity constraints. (The ICMC and IDMC involve I - 2 independent
variables.) Since P is envisioned to be not much larger than 10 x 10, a modified
Newton method was considered desirable because it should converge in a small
number of iterations. QRMNEW (see May), an optimization method not requiring
analytical derivatives, was employed because of its ease of adaptation and
computational use.

QRMNEW is a hybrid local variations-modified Newton method, using orthogonal
(QR) matrix factorization to derive a representation for the locally feasible
region. It has been proven globally convergent to a point satisfying both first
and second order necessary optimality conditions, so that any solution generated
is at least a local optimum. Superlinear and order 2 convergence rates can be
established under somewhat stronger conditions. Denote by $\{(x,y)^k\}$ the iterative
sequence of points generated by the algorithm. In general, because of the lack
of pseudoconcavity (pseudoconvexity) for the CMC and ICMC (DMC and IDMC), an
iterate $(x,y)^k$ will usually be in a region not locally concave (convex). The
algorithm does have a rather sophisticated method for dealing with the indefinite
projected matrix of second derivatives implied by the lack of local concavity
(convexity).
While a complete mathematical description of QRMNEW is given by May, a
general iteration is illustrated in Figure 4.1, to show the underlying logic.

Given the current point x and a stepsize s > 0, the constraints, if any, that
are satisfied exactly at x are used to generate, via orthogonal matrix
factorization, a set of n coordinate directions. If u constraints are active, n-u
directions lie in the manifold determined by those constraints, and the remaining
u directions are determined by computing a generalized inverse and are
orthogonal to that manifold. (If no constraints are active, the standard
Cartesian coordinate system is used.) The objective function is then evaluated at
2 points along each of these coordinate axes. For example, in Figure 4.1, two
constraints are active in $R^3$. Three directions are generated: $d_1$, which lies
in the manifold, so that movement away from x in either $+d_1$ or $-d_1$ is feasible;
$d_2$, where going along $+d_2$ leads to infeasibility; and $d_3$, which is analogous to
$d_2$. The function is evaluated at points 1 through 6, yielding second order
approximations to first and second partial directional derivatives along $d_1$, $d_2$,
and $d_3$. Assume a maximum is being sought, e.g., computing the CMC or ICMC, and
that the first derivatives along $d_1$, $d_2$, and $d_3$ are, respectively, positive,
positive, and negative. Then the objective function cannot be increased by
movement along $d_2$, so that it is dropped from consideration. The function is then
evaluated at point 7, which is needed to approximate the second mixed partial
directional derivative with respect to $d_1$ and $d_3$. A Newton-type search
direction is computed and searched, and the algorithm moves to the best of the points
found by the Newton search procedure and points 1 through 7.

[Figure 4.1. An Iteration of QRMNEW (figure labels: Constraint #1, Constraint #2)]

MONCOR is an interactive package designed to analyze probability matrices, P,
of dimension less than or equal to 20 x 20. The user may input a single starting
point for an optimization run, or allow the program to generate its own multiple
starting points. In both cases, the constraint set corresponding to the
correlation measure requested is generated internally, and QRMNEW is used to
compute an optimum. Additionally, two different strategies are employed in
seeking an optimal solution. Numerical experience indicates that optimum values
sometimes lie at monotone extreme points, i.e., points where all the x and y
entries are either zero or one. This appears to be especially the case when
computing the DMC or IDMC for a matrix with highly positive CMC, and vice-versa.
In fact, for certain cases, the optima for all four monotone correlation measures
might be achieved only at such points. Additionally, because non-optimal
monotone extreme points can be local optima (satisfying the Karush-Kuhn-Tucker
second order necessary optimality conditions (see Kuhn and Tucker (1951))), QRMNEW
starting from a random point might well be trapped by these local optima. Note
that for an I x J matrix, there are only (I - 1)(J - 1) monotone extreme points
to consider for the CMC and DMC ((n - 1) for ICMC and IDMC, assuming I = J = n).
Hence, in order to avoid stopping at a local optimum when the global optimum is a
monotone extreme point, MONCOR evaluates the correlation of all monotone extreme
points. Moreover, MONCOR generates ten random monotone points, with coordinates
selected on (0,1), using the DEC random number generator (see Payne, Rabung, and
Bogyo (1969)), and calls QRMNEW to compute an optimum starting from each of them.
The user may select to see only the final output, or an iteration-by-iteration
output of the monotone correlations and monotone variables.
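
The exhaustive check of monotone extreme points is simple enough to sketch. The code below is an illustrative assumption, not the MONCOR implementation (which is written in FORTRAN): the names `corr`, `extreme_scalings`, and `best_extreme_point` and the 3 x 3 matrix are hypothetical, and all marginal probabilities are assumed positive so that no variance vanishes.

```python
from math import sqrt

def corr(P, x, y):
    """Correlation of scalings x (rows) and y (columns) under probability matrix P."""
    I, J = len(P), len(P[0])
    p_row = [sum(P[i]) for i in range(I)]
    p_col = [sum(P[i][j] for i in range(I)) for j in range(J)]
    ex = sum(p_row[i] * x[i] for i in range(I))
    ey = sum(p_col[j] * y[j] for j in range(J))
    exy = sum(P[i][j] * x[i] * y[j] for i in range(I) for j in range(J))
    vx = sum(p_row[i] * x[i] ** 2 for i in range(I)) - ex ** 2
    vy = sum(p_col[j] * y[j] ** 2 for j in range(J)) - ey ** 2
    return (exy - ex * ey) / sqrt(vx * vy)

def extreme_scalings(n):
    """Monotone 0/1 vectors with first entry 0 and last entry 1: k zeros then n-k ones."""
    return [[0.0] * k + [1.0] * (n - k) for k in range(1, n)]

def best_extreme_point(P, maximize=True):
    """Evaluate all (I-1)(J-1) monotone extreme points; return the best and the count."""
    I, J = len(P), len(P[0])
    candidates = [(corr(P, x, y), x, y)
                  for x in extreme_scalings(I)
                  for y in extreme_scalings(J)]
    pick = max if maximize else min
    return pick(candidates, key=lambda c: c[0]), len(candidates)

# Hypothetical 3 x 3 probability matrix (illustrative only).
P = [[0.2, 0.1, 0.0],
     [0.1, 0.2, 0.1],
     [0.0, 0.1, 0.2]]
(best_value, best_x, best_y), n_points = best_extreme_point(P)
print(n_points, round(best_value, 4), best_x, best_y)
```

The value found this way is only a candidate: it would still need to be compared with the optima returned by the local searches from random monotone starting points.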

5. Applications

By means of the algorithm and the MONCOR program, we now compute the CMC,
etc., for several insightful examples. Let (X,Y) be a discrete bivariate random
vector taking values in a 6 x 6 lattice: $\{a_1, \ldots, a_6\} \times \{b_1, \ldots, b_6\}$. Further
suppose $\mathrm{Prob}(X = a_i) = 1/6$, for all i, and $\mathrm{Prob}(Y = b_j) = 1/6$, for all j; i.e.,
X and Y have uniform marginals. If X and Y are monotone increasing dependent
then $P = (1/6) I$, where $P = \{\mathrm{Prob}(X = i, Y = j)\}$, and I is the 6 x 6 identity matrix;
if X and Y are monotone decreasing dependent then $P = (1/6) I^*$, where
$I^* = \{\delta(i + j - 7)\}$, and $\delta(x)$ is 1 if x = 0, and is 0 otherwise. Now consider a
one-parameter family of distributions indexed by $\theta$, i.e., for a given $\theta$,
$\mathrm{Prob}(X = i, Y = j)$ is the (i,j)th element of $P_\theta$, where

(5.1)  $P_\theta = \left(\dfrac{1+\theta}{2}\right)(1/6)\, I + \left(\dfrac{1-\theta}{2}\right)(1/6)\, I^*,$

where $-1 \le \theta \le 1$. Note that X and Y still have uniform marginal distributions
for all $\theta$. For $\theta = 1\,(-1)$, $P_\theta$ corresponds to the most monotone increasing
(decreasing) dependent case; and intermediate values of $\theta$ describe varying degrees
of mixtures of the two dependent extremes. In Figure 5.1, we graph the values
of the CMC, ICMC, DMC and IDMC as functions of $\theta$ for P given by (5.1).
(Moreover, because the support of X and Y is two disjunct pieces in the sense of
Lancaster, it follows that the sup correlation $\rho'$ is 1 for all $\theta$ in (5.1).)

[Figure 5.1. CMC, DMC, ICMC, IDMC vs. $\theta$, for P given by (5.1)]

From Figure 5.1, it can be observed that the CMC, DMC, ICMC, and IDMC are
all linear functions of $\theta$. Moreover, the IDMC = $\theta$. The CMC and ICMC coincide,
and the CMC at $\theta$ is equal to the negative of the DMC evaluated at $-\theta$.
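
The family (5.1) is easy to construct numerically. The sketch below builds $P_\theta$ and evaluates the isoscaled correlation for the equally spaced scaling (0, .2, .4, .6, .8, 1); the function names `p_theta` and `corr` are assumptions of this sketch. Because this scaling is one feasible choice with f = g, the value it attains is an upper bound on the IDMC.

```python
from math import sqrt

def p_theta(theta, n=6):
    """P_theta of (5.1): mixture of (1/n) I and (1/n) I*, where I* is the anti-diagonal."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] += (1 + theta) / 2 / n          # monotone increasing dependent part
        P[i][n - 1 - i] += (1 - theta) / 2 / n  # monotone decreasing dependent part
    return P

def corr(P, x, y):
    """Correlation of scalings x (rows) and y (columns) under probability matrix P."""
    I, J = len(P), len(P[0])
    p_row = [sum(P[i]) for i in range(I)]
    p_col = [sum(P[i][j] for i in range(I)) for j in range(J)]
    ex = sum(p_row[i] * x[i] for i in range(I))
    ey = sum(p_col[j] * y[j] for j in range(J))
    exy = sum(P[i][j] * x[i] * y[j] for i in range(I) for j in range(J))
    vx = sum(p_row[i] * x[i] ** 2 for i in range(I)) - ex ** 2
    vy = sum(p_col[j] * y[j] ** 2 for j in range(J)) - ey ** 2
    return (exy - ex * ey) / sqrt(vx * vy)

theta = 0.3
P = p_theta(theta)
row_sums = [sum(row) for row in P]       # uniform marginals: each should be 1/6
linear = [k / 5 for k in range(6)]       # equally spaced scaling with x_1 = 0, x_6 = 1
r_linear = corr(P, linear, linear)
print([round(s, 4) for s in row_sums], round(r_linear, 4))
```

For the equally spaced scaling the correlation equals $\theta$ itself: the identity part contributes correlation +1 and the anti-diagonal part contributes -1, weighted by $(1+\theta)/2$ and $(1-\theta)/2$.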

Now consider (X,Y) defined on a 3 x 3 lattice with probability matrix

(5.2)  P = [3 x 3 matrix; entries not legible in the source]

so that, for example, one of the cell probabilities $\mathrm{Prob}(X = a_i, Y = b_j)$ equals 1/4
(the subscripts are not legible in the source). Note that P is a symmetric
probability matrix, so that X and Y are exchangeable random variables. It
follows in this case by direct computation or by use of MONCOR, that the ICMC is
0, and the monotone variables for X and Y are (0, .5, 1)'. However, the CMC is
1/3, and the monotone variables for X and Y, respectively, are either (0, 1, 1)'
and (0, 0, 1)' or (0, 0, 1)' and (0, 1, 1)'. Thus (5.2) provides an example of
exchangeable random variables where ICMC $\ne$ CMC.
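
The quoted values can be checked by direct computation once a matrix is fixed. Purely as an assumption for illustration, the sketch below uses the symmetric matrix with entries 1/4 in positions (1,2), (2,1), (2,3), and (3,2); it was chosen because it reproduces the values stated above, and it is not necessarily the matrix of (5.2). The helper name `corr` is likewise hypothetical.

```python
from math import sqrt

def corr(P, x, y):
    """Correlation of scalings x (rows) and y (columns) under probability matrix P."""
    I, J = len(P), len(P[0])
    p_row = [sum(P[i]) for i in range(I)]
    p_col = [sum(P[i][j] for i in range(I)) for j in range(J)]
    ex = sum(p_row[i] * x[i] for i in range(I))
    ey = sum(p_col[j] * y[j] for j in range(J))
    exy = sum(P[i][j] * x[i] * y[j] for i in range(I) for j in range(J))
    vx = sum(p_row[i] * x[i] ** 2 for i in range(I)) - ex ** 2
    vy = sum(p_col[j] * y[j] ** 2 for j in range(J)) - ey ** 2
    return (exy - ex * ey) / sqrt(vx * vy)

# ASSUMED matrix (not taken from the source): symmetric, entries sum to one.
P = [[0.00, 0.25, 0.00],
     [0.25, 0.00, 0.25],
     [0.00, 0.25, 0.00]]

# Isoscaling: the same scaling (0, t, 1) for X and Y, searched on a grid of t values.
iso = {k / 100: corr(P, [0, k / 100, 1], [0, k / 100, 1]) for k in range(101)}
best_t = max(iso, key=iso.get)

# Separate scalings at the two monotone extreme points quoted in the text.
cmc_a = corr(P, [0, 1, 1], [0, 0, 1])
cmc_b = corr(P, [0, 0, 1], [0, 1, 1])
print(best_t, iso[best_t], cmc_a, cmc_b)
```

Under this assumed matrix the isoscaled correlation is never positive and reaches 0 at t = .5, while the two extreme-point scalings both give 1/3, matching the ICMC, CMC, and monotone variables reported above.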

We  now  consider  applying  these  monotone  measures  to  an  actual  data  example, 
taken  from  Bishop,  Holland,  and  Fienberg  (1975,  p.  100),  which  in  turn  was 
adapted  from  Glass  and  Hall  (1954,  p.  183).  These  data  are  given  in  Table  5.1. 

Because the same categories are used to measure father's and son's
occupational status, it is appropriate to use isoscaling. The ICMC, IDMC and the
associated monotone variables were computed by the MONCOR program based on
the empirical probability matrix specified by Table 5.1. The values of the
ICMC and IDMC as well as the monotone variables are presented in Table 5.2.

Table 5.1

British Mobility Data
(3,500 Father-Son Data Values)

                                 Son's Occupational Status†
Father's Occupational
Status†                      S1      S2      S3      S4      S5

S1                           50      45       8      18       8
S2                           28     174      84     154      55
S3                           11      78     110     223      96
S4                           14     150     185     714     447
S5                            3      42      72     320     411

Table 5.2

ICMC, IDMC and Monotone Variables for
British Mobility Data

Measure    Value of Measure    Monotone Variable Values

ICMC       .496                0.    .627    .842    .923    1.0
IDMC       .242                0.    0.      0.      0.      1.0

The analogous version of (2.1) for isoscaling, namely $\mathrm{IDMC} \le \rho(f(X), g(Y)) \le \mathrm{ICMC}$
with f = g, shows that regardless of the order-preserving assignment of numerical
values to the five ordinal categories, the resultant correlation is between .242
and .496.

†Status S1 is professional and high administrative; status S2 is
managerial, executive, and higher grade supervisory; status S3 is lower grade
supervisory; status S4 is skilled manual; and status S5 is semi-skilled and unskilled
manual.
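
The entries of Table 5.2 can be reproduced from the counts in Table 5.1 by evaluating the isoscaled correlation at the reported monotone variables. The sketch below does only that evaluation (it does not perform the optimization), and the function name `iso_corr` is an assumption; the equally spaced scaling 1, ..., 5 is included as one arbitrary increasing scaling for comparison.

```python
from math import sqrt

# Table 5.1 counts: rows = father's status S1..S5, columns = son's status S1..S5.
COUNTS = [
    [50, 45, 8, 18, 8],
    [28, 174, 84, 154, 55],
    [11, 78, 110, 223, 96],
    [14, 150, 185, 714, 447],
    [3, 42, 72, 320, 411],
]

def iso_corr(counts, s):
    """Correlation when the same scaling s is applied to both rows and columns."""
    n = sum(map(sum, counts))
    k = len(counts)
    P = [[c / n for c in row] for row in counts]   # empirical probability matrix
    p_row = [sum(P[i]) for i in range(k)]
    p_col = [sum(P[i][j] for i in range(k)) for j in range(k)]
    ex = sum(p_row[i] * s[i] for i in range(k))
    ey = sum(p_col[j] * s[j] for j in range(k))
    exy = sum(P[i][j] * s[i] * s[j] for i in range(k) for j in range(k))
    vx = sum(p_row[i] * s[i] ** 2 for i in range(k)) - ex ** 2
    vy = sum(p_col[j] * s[j] ** 2 for j in range(k)) - ey ** 2
    return (exy - ex * ey) / sqrt(vx * vy)

total = sum(map(sum, COUNTS))
r_icmc = iso_corr(COUNTS, [0.0, 0.627, 0.842, 0.923, 1.0])  # ICMC row of Table 5.2
r_idmc = iso_corr(COUNTS, [0.0, 0.0, 0.0, 0.0, 1.0])        # IDMC row of Table 5.2
r_equal = iso_corr(COUNTS, [1, 2, 3, 4, 5])                 # equally spaced codes
print(total, round(r_icmc, 3), round(r_idmc, 3), round(r_equal, 3))
```

The two reported scalings give values that agree with .496 and .242 to about three decimals, and the equally spaced scaling falls between them, as the bound requires.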


One important use of monotone variable theory is the ability to develop
meaningful scales for ordinal variables. For example, suppose a five-point
scale response to some question is elicited pre- and post- some experimental
intervention. Through the use of the ICMC, we can provide a numerical scale for
this five-point response; this numerical scale has the property that among all
possible such ordinal scalings, the post-response for this scaling is most
linearly predictable from the pre-response. In Table 5.2 the row
corresponding to ICMC provides this scaling for the occupational status variable based
on the British mobility data. Specifically, the numerical values for S1, S2,
S3, S4, and S5 are 0., .627, .842, .923, and 1.0, respectively.

Often, the number of distinct values for the numerically scaled variables
is substantially less than the number of values for the original ordinal
variables. This reduction occurs when the optimizing f, g in (1.2) are not
one-to-one. To illustrate this phenomenon, we consider the following example.
A 10 x 10 matrix was generated where each entry is a randomly generated number
on (0,1), chosen independently of the other entries. In order to
produce a positive dependent distribution, the constant 2 was added
to each diagonal entry and the entire matrix scaled so as to add to one. The resultant
matrix is given in Table 6.1.

The CMC for the matrix in Table 6.1 is 0.443, and the monotone variables for
$a_1, \ldots, a_{10}$, and $b_1, \ldots, b_{10}$, are, respectively, (.000, .461, .461, .461, .872,
.872, .872, .872, .873, 1.000)' and (.000, .537, .541, .541, .842, .842, .842, .842,
.842, 1.000)'. Note that while the original variables each had 10 separate values,
there are only five distinct monotonely scaled values for X and five for Y. While
this scale reduction phenomenon is based upon empirical observation, it is clear
that  it  has  great  potential  value  in  deriving  simplified  scales  for  large  data 
Table 6.1

Random 10 x 10 Probability Matrix

        b1     b2     b3     b4     b5     b6     b7     b8     b9     b10

a1    .0331  .0111  .0092  .0049  .0016  .0028  .0009  .0108  .0096  .0007
a2    .0101  .0361  .0057  .0081  .0133  .0062  .0121  .0066  .0003  .0020
a3    .0102  .0059  .0347  .0027  .0055  .0020  .0104  .0046  .0069  .0056
a4    .0144  .0018  .0065  .0342  .0006  .0071  .0055  .0066  .0084  .0113
a5    .0006  .0016  .0087  .0132  .0435  .0061  .0100  .0046  .0044  .0053
a6    .0022  .0035  .0151  .0015  .0056  .0427  .0062  .0035  .0089  .0125
a7    .0002  .0084  .0026  .0020  .0005  .0086  .0387  .0007  .0034  .0111
a8    .0084  .0100  .0079  .0036  .0100  .0128  .0044  .0303  .0121  .0065
a9    .0028  .0079  .0141  .0008  .0133  .0077  .0064  .0139  .0402  .0068
a10   .0009  .0149  .0042  .0108  .0022  .0144  .0130  .0151  .0146  .0438
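
The scale reduction described for this example amounts to merging adjacent categories whose monotone variable values coincide. A minimal sketch follows; the function name `merge_levels` and its tolerance parameter `tol` are assumptions introduced here, not features of MONCOR.

```python
def merge_levels(scores, tol=0.0):
    """Group adjacent ordinal levels (1-based) whose scaled values differ by at most tol."""
    groups = [[1]]
    for k in range(1, len(scores)):
        if abs(scores[k] - scores[k - 1]) <= tol:
            groups[-1].append(k + 1)   # same scaled value: merge with the previous level
        else:
            groups.append([k + 1])     # new distinct scaled value: start a new group
    return groups

# Monotone variables reported in the text for a1..a10 and b1..b10.
x = [.000, .461, .461, .461, .872, .872, .872, .872, .873, 1.000]
y = [.000, .537, .541, .541, .842, .842, .842, .842, .842, 1.000]

groups_x = merge_levels(x)
groups_y = merge_levels(y)
print(groups_x)
print(groups_y)
print(len(merge_levels(x, tol=0.005)), len(merge_levels(y, tol=0.005)))
```

With exact matching each variable reduces from ten levels to five groups. Allowing a small tolerance (here .005, an arbitrary choice) also merges the nearly equal values .872/.873 and .537/.541, leaving four groups each; whether such near-ties should be merged is a judgment the reported values alone do not settle.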

7.  Program  Availability 

The MONCOR program, written in FORTRAN, is available for distribution. For
specific details contact Professor Jerrold May, Graduate School of Business,
University of Pittsburgh, Pittsburgh, PA 15260.